Can Voice User Interfaces Say “I”? An Experiment with Recorded Speech and TTS
Abstract
How do people respond to voice user interfaces (VUIs) that use first-person pronouns as opposed to those that use the passive voice? We addressed this question in the context of a phone-based auction via a 2 (first person vs. passive voice) × 2 (recorded vs. synthesized speech) between-participants experiment (N = 48). There were significant cross-over interactions with respect to user relaxation, perception of system quality, and bidding behavior: for recorded speech, participants responded more positively to personal pronouns than to the passive voice, whereas for synthesized speech they responded more positively to the passive voice. We discuss implications for the theory and design of VUIs.
Related resources
Rvoice Studio and Activeprompts
ActivePrompts are a new technology from Rhetorical, designed to offer a quicker and cheaper alternative to using voice talents and recording studios for the creation of an application-specific prompt library. Most speech user interfaces optimize the quality of their speech output by using pre-recorded prompts. In these applications, TTS is often considered to be a fall-back technology to genera...
The Impact of Auditory Embodiment on Animated Character Design
Advances in speech recognition and text-to-speech (TTS) technologies recently have contributed to the development of conversational interfaces that incorporate animated characters. These interfaces potentially are well suited for educational software, since they can engage children as active learners and support question asking skills. In the present research, a simulation study was conducted i...
Experimental tools to evaluate intelligibility of text-to-speech (TTS) synthesis: effects of voice gender and signal quality
Two experiments are reported that constitute new methods for evaluation of text-to-speech (TTS) synthesis from the user’s perspective. Experiment 1, using sentence stimuli, and Experiment 2, using discrete word stimuli, investigate the effect of voice gender and signal quality on the intelligibility of three TTS synthesis systems from the user’s point of view. Accuracy scores and reaction time ...
Preliminary experiments toward automatic generation of new TTS voices from recorded speech alone
To generate a new concatenative text-to-speech (TTS) voice from recordings of a human’s voice, not only recordings but also additional information such as the transcriptions, prosodic labels, and the phonemic alignments are necessary. Since some of the information depends on the speaking style of the narrator, these types of information need to be manually added by listening to the recordings, ...
Unit selection based on voice recognition
In this paper, we describe a perceptual voice recognition method to improve the naturalness of synthesized speech for a Mandarin Chinese text-to-speech (TTS) baseline system. As a large TTS speech corpus, speech data always has different acoustic properties for different data recording conditions. Speech data recorded under different conditions can finally influence the naturalness of synthesized...